Speech analysis and recognition using interval statistics generated from a composite auditory model
نویسندگان
چکیده
A modeling approach to auditory speech analysis and recognition is proposed and evaluated, where a composite auditory model is used to generate parallel sets of auditory-nerve instantaneous firing rates (IFR’s) along the spatial dimension, followed by a processing stage that constructs from the IFR’s an interval statistics in a form called the interpeak interval histogram (IPIH). A speech preprocessor is designed that performs transformation on the auditory IPIH’s and interfaces the IPIH-based auditory representation with a hidden Markov model-based (HMM-based) speech recognizer. The results demonstrate that the new preprocessor consistently outperforms the conventional mel frequency cepstral coefficient-based (MFCC-based) preprocessor for the signal-tonoise ratio (SNR) level up to at least 16 dB.
منابع مشابه
Correlation between Auditory Spectral Resolution and Speech Perception in Children with Cochlear Implants
Background: Variability in speech performance is a major concern for children with cochlear implants (CIs). Spectral resolution is an important acoustic component in speech perception. Considerable variability and limitations of spectral resolution in children with CIs may lead to individual differences in speech performance. The aim of this study was to assess the correlation between auditory ...
متن کاملUsing Eye Movement Analysis to Study Auditory Effects on Visual Memory Recall
Recent studies in affective computing are focused on sensing human cognitive context using biosignals. In this study, electrooculography (EOG) was utilized to investigate memory recall accessibility via eye movement patterns. 12 subjects were participated in our experiment wherein pictures from four categories were presented. Each category contained nine pictures of which three were presented t...
متن کاملPersian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
متن کاملThe Representation of Speech in a Nonlinear Auditory Model: Time-Domain Analysis of Simulated Auditory-Nerve Firing Patterns
A nonlinear auditory model is appraised in terms of its ability to encode speech formant frequencies in the fine time structure of its output. It is demonstrated that groups of model auditory nerve (AN) fibres with similar interpeak intervals accurately encode the resonances of synthetic three-formant syllables, in close agreement with physiological data. Acoustic features are derived from the ...
متن کاملمدل میکروسکوپی دوگوشی مبتنی بر فیلتر بانک مدولاسیون برای پیش گویی قابلیت فهم گفتار در افراد دارای شنوایی عادی
In this study, a binaural microscopic model for the prediction of speech intelligibility based on the modulation filter bank is introduced. So far, the spectral criteria such as the STI and SII or other analytical methods have been used in the binaural models to determine the binaural intelligibility. In the proposed model, unlike all models of binaural intelligibility prediction, an automatic ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- IEEE Trans. Speech and Audio Processing
دوره 6 شماره
صفحات -
تاریخ انتشار 1998